Took the SeaFan consensus
http://aquacul4.fish.washington.edu/~steven/ballyhoo/Query_SeaFan_Consensus90230.fa

and blasted it against the Nematostella transcripts:
transcripts.Nemve1FilteredModels1.fasta.gz

OUTPUT
http://aquacul4.fish.washington.edu/~steven/ballyhoo/blast.coral-20120405.out


Imported into Galaxy:
~1.3 million lines

Filter on evalue: c11<=0.00001


33k lines
--
c11<=1e-20
14,905 lines


Galaxy110-[Filter_on_data_108].tabular
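
The two evalue cutoffs above could also be reproduced outside Galaxy. A minimal sketch, assuming the BLAST output is the standard 12-column tabular format (so Galaxy's c11 is the evalue). The function name and the sample rows are made up for illustration and are not hits from this run.

```python
def filter_by_evalue(lines, max_evalue, evalue_col=11):
    """Yield tab-delimited BLAST hit lines whose evalue is <= max_evalue.

    evalue_col is 1-based, matching Galaxy's c11 notation.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if float(fields[evalue_col - 1]) <= max_evalue:
            yield line


# Made-up example rows (assumed 12-column layout, evalue in column 11)
sample = [
    "contigA\ttx1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t120",
    "contigB\ttx2\t80.0\t60\t12\t0\t1\t60\t1\t60\t1e-06\t50",
    "contigC\ttx3\t75.0\t40\t10\t0\t1\t40\t1\t40\t0.5\t30",
]

loose = list(filter_by_evalue(sample, 1e-5))    # same cutoff as c11<=0.00001
strict = list(filter_by_evalue(sample, 1e-20))  # same cutoff as c11<=1e-20
print(len(loose), len(strict))
```

On the real file the same function can be fed an open file handle instead of the sample list.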

5375 Nematostella transcripts had significant matches (<=1e-20) in the SeaFan transcriptome.

There are 27,273 transcripts in Nematostella.
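
For context, 5375 of 27,273 is roughly 19.7% of the transcript set. A sketch of how the unique count and the percentage could be checked, assuming the SeaFan consensus was the query so that the Nematostella transcript IDs sit in column 2 of the filtered table. The helper names and the sample rows are hypothetical.

```python
def unique_ids(lines, col):
    """Return the set of distinct values in a 1-based column of a tab-delimited table."""
    ids = set()
    for line in lines:
        line = line.rstrip("\n")
        if line:
            ids.add(line.split("\t")[col - 1])
    return ids


def percent_matched(n_matched, n_total):
    """Share of the transcript set with at least one hit, as a percentage."""
    return 100.0 * n_matched / n_total


# Made-up rows: two contigs hit the same transcript, one hits another
sample = [
    "contigA\ttx1\t90.0",
    "contigB\ttx1\t88.0",
    "contigC\ttx2\t85.0",
]
subjects = unique_ids(sample, col=2)
print(len(subjects))

# Counts taken from the notes above
print(round(percent_matched(5375, 27273), 1))
```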



Filter on bitscore >45
40,263 lines


11,304 unique SeaFan contigs had matches with 9,496 unique Nematostella transcripts.
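
The bitscore filter and both unique counts can be gathered in one pass. This is only a sketch under the same assumptions as before (12-column tabular output, bitscore in column 12, SeaFan contig in column 1, Nematostella transcript in column 2). The function name and sample rows are invented.

```python
def summarize_hits(lines, min_bitscore=45.0):
    """Count hit lines with bitscore > min_bitscore and the unique IDs on each side.

    Returns (kept_lines, unique_queries, unique_subjects).
    """
    queries, subjects, kept = set(), set(), 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a complete 12-column hit line
        if float(fields[11]) > min_bitscore:  # strict '>' as in the notes
            kept += 1
            queries.add(fields[0])
            subjects.add(fields[1])
    return kept, len(queries), len(subjects)


# Made-up rows; only the IDs and the last column (bitscore) matter here
template = "{q}\t{s}\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-10\t{b}"
sample = [
    template.format(q="contigA", s="tx1", b=120),
    template.format(q="contigA", s="tx2", b=60),
    template.format(q="contigB", s="tx2", b=50),
    template.format(q="contigC", s="tx3", b=30),
    template.format(q="contigD", s="tx4", b=45),  # on the boundary, not kept
]
kept, n_queries, n_subjects = summarize_hits(sample)
print(kept, n_queries, n_subjects)
```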






--
cd-hit-est - Nematostella

./cd-hit-est -i /Volumes/Bay4\ scratch/temp/transcripts.Nemve1FilteredModels1.fasta -o /Volumes/Bay4\ scratch/temp/transcripts.Nemve1FilteredModels1_cdhit -M 2500

Started: Thu Apr  5 11:29:41 2012
================================================================
                            Output                             
----------------------------------------------------------------
total seq: 27273
longest and shortest : 26238 and 150
Total letters: 29770590
Sequences have been sorted

Approximated minimal memory consumption:
Sequence        : 33M
Buffer          : 1 X 17M = 17M
Table           : 1 X 17M = 17M
Miscellaneous   : 4M
Total           : 72M

Table limit with the given memory limit:
Max number of representatives: 4194304
Max number of word counting entries: 303385279

OUTPUT
about 23K clusters

transcripts.Nemve1FilteredModels1_cdhit
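
To get an exact cluster count rather than "about 23K", the cluster listing can be tallied. cd-hit writes a companion `.clstr` file next to the output FASTA, where each cluster starts with a line beginning `>Cluster`. A minimal sketch, with made-up sample lines in that layout. The function name is hypothetical.

```python
def cluster_sizes(lines):
    """Return the number of member sequences in each cluster of a .clstr listing."""
    sizes = []
    for line in lines:
        if line.startswith(">Cluster"):
            sizes.append(0)  # a new cluster begins
        elif line.strip() and sizes:
            sizes[-1] += 1   # a member line of the current cluster
    return sizes


# Made-up example in the .clstr layout (not output from this run)
sample = [
    ">Cluster 0",
    "0\t26238nt, >seq1... *",
    "1\t2000nt, >seq2... at +/97.50%",
    ">Cluster 1",
    "0\t150nt, >seq3... *",
]
sizes = cluster_sizes(sample)
print(len(sizes), sum(sizes))
```

For the real run, the lines would come from opening transcripts.Nemve1FilteredModels1_cdhit.clstr, assuming it sits in the same directory as the output above.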